{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3.9 多层感知机的从零开始实现\n",
    "我们已经从上一节里了解了多层感知机的原理。下面，我们一起来动手实现一个多层感知机。首先导入实现所需的包或模块。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2.1.0\n"
     ]
    }
   ],
   "source": [
    "import tensorflow as tf\n",
    "import numpy as np\n",
    "import sys\n",
    "sys.path.append(\"..\") # 为了导入上层目录的d2lzh_tensorflow\n",
    "import d2lzh_tensorflow2 as d2l\n",
    "print(tf.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3.9.1 获取和读取数据\n",
    "这里继续使用Fashion-MNIST数据集。我们将使用多层感知机对图像进行分类"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tensorflow.keras.datasets import fashion_mnist\n",
    "(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()\n",
    "batch_size = 256\n",
    "x_train = tf.cast(x_train, tf.float32)\n",
    "x_test = tf.cast(x_test, tf.float32)\n",
    "x_train = x_train/255.0\n",
    "x_test = x_test/255.0\n",
    "train_iter = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(batch_size)\n",
    "test_iter = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(batch_size)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3.9.2 定义模型参数\n",
    "我们在3.6节（softmax回归的从零开始实现）里已经介绍了，Fashion-MNIST数据集中图像形状为 28×28，类别数为10。本节中我们依然使用长度为 28×28=784 的向量表示每一张图像。因此，输入个数为784，输出个数为10。实验中，我们设超参数隐藏单元个数为256。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "num_inputs, num_outputs, num_hiddens = 784, 10, 256\n",
    "W1 = tf.Variable(tf.random.normal(shape=(num_inputs, num_hiddens),mean=0, stddev=0.01, dtype=tf.float32))\n",
    "b1 = tf.Variable(tf.zeros(num_hiddens, dtype=tf.float32))\n",
    "W2 = tf.Variable(tf.random.normal(shape=(num_hiddens, num_outputs),mean=0, stddev=0.01, dtype=tf.float32))\n",
    "b2 = tf.Variable(tf.random.normal([num_outputs], stddev=0.1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3.9.3 定义激活函数\n",
    "这里我们使用基础的max函数来实现ReLU，而非直接调用relu函数。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "def relu(x):\n",
    "    return tf.math.maximum(x,0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "def net(X):\n",
    "    X = tf.reshape(X, shape=[-1, num_inputs])\n",
    "    h = relu(tf.matmul(X, W1) + b1)\n",
    "    return tf.math.softmax(tf.matmul(h, W2) + b2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3.9.5. 定义损失函数¶\n",
    "为了得到更好的数值稳定性，我们直接使用Tensorflow提供的包括softmax运算和交叉熵损失计算的函数。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "def loss(y_hat,y_true):\n",
    "    return tf.losses.sparse_categorical_crossentropy(y_true,y_hat)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3.9.6. 训练模型"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "num_epochs, lr = 5, 0.5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch 1, loss 0.7892, train acc 0.703, test acc 0.818\n",
      "epoch 2, loss 0.4805, train acc 0.822, test acc 0.840\n",
      "epoch 3, loss 0.4183, train acc 0.844, test acc 0.851\n",
      "epoch 4, loss 0.3853, train acc 0.857, test acc 0.857\n",
      "epoch 5, loss 0.3607, train acc 0.867, test acc 0.864\n"
     ]
    }
   ],
   "source": [
    "num_epochs, lr = 5, 0.5\n",
    "params = [W1, b1, W2, b2]\n",
    "d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, params, lr)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
